AITopics

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Russia (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems, Feb-9-2026, 21:29:44 GMT

b20706935de35bbe643733f856d9e5d6-Supplemental.pdf

approximation, dz 0, likelihood, (16 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)

Neural Information Processing Systems, Oct-10-2025, 15:08:28 GMT

Optimal Flow Matching: Learning Straight Trajectories in Just One Step

However, such processes usually have curved trajectories, resulting in time-consuming ODE integration for sampling.

flow matching, ofm, trajectory, (14 more...)

Neural Information Processing Systems, Aug-15-2025, 21:07:32 GMT

b20706935de35bbe643733f856d9e5d6-Supplemental.pdf

approximation, dz 0, likelihood, (16 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)

arXiv.org Machine Learning, Sep-17-2024

Survey of Data-driven Newsvendor: Unified Analysis and Spectrum of Achievable Regrets

Chen, Zhuoxin, Ma, Will

In the Newsvendor problem, the goal is to guess the number that will be drawn from some distribution, with asymmetric consequences for guessing too high vs. too low. In the data-driven version, the distribution is unknown, and one must work with samples from the distribution. Data-driven Newsvendor has been studied under many variants: additive vs. multiplicative regret, high probability vs. expectation bounds, and different distribution classes. This paper studies all combinations of these variants, filling in many gaps in the literature and simplifying many proofs. In particular, we provide a unified analysis based on the notion of clustered distributions, which in conjunction with our new lower bounds, shows that the entire spectrum of regrets between $1/\sqrt{n}$ and $1/n$ can be possible.
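As a minimal sketch of the data-driven setting described above, the snippet below uses sample average approximation (SAA), a standard approach that is not necessarily the estimator analyzed in the paper. It orders the empirical quantile of the samples at the critical ratio `underage_cost / (underage_cost + overage_cost)`. The function names, cost parameters, and demand samples are illustrative assumptions, not taken from the paper.

```python
import math


def newsvendor_cost(order, demand, underage_cost, overage_cost):
    """Asymmetric cost of guessing `order` when the realized draw is `demand`."""
    if demand > order:
        return underage_cost * (demand - order)  # guessed too low
    return overage_cost * (order - demand)       # guessed too high (or exact)


def saa_order(samples, underage_cost, overage_cost):
    """Empirical quantile at the critical ratio q = cu / (cu + co)."""
    q = underage_cost / (underage_cost + overage_cost)
    ordered = sorted(samples)
    # Smallest sample whose empirical CDF value is at least q.
    index = max(math.ceil(q * len(ordered)) - 1, 0)
    return ordered[index]


def average_cost(order, samples, underage_cost, overage_cost):
    """Mean cost of `order` over the observed samples."""
    total = sum(
        newsvendor_cost(order, d, underage_cost, overage_cost) for d in samples
    )
    return total / len(samples)


# Hypothetical demand samples; guessing too low is three times as costly.
samples = [12, 7, 9, 15, 10, 8, 11, 14, 9, 13]
order = saa_order(samples, underage_cost=3.0, overage_cost=1.0)
```

Because guessing too low is penalized more heavily here, the chosen order sits above the sample median. The regret notions in the abstract measure how far the true expected cost of such a data-driven order is from that of the optimal order under the unknown distribution.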

algorithm, artificial intelligence, machine learning, (16 more...)

2409.03505

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

Nishiyama, Sota, Ohzeki, Masayuki

Solution space and storage capacity of fully connected two-layer neural networks with generic activation functions

arXiv.org Machine Learning, Apr-20-2024

The storage capacity of a binary classification model is the maximum number of random input-output pairs per parameter that the model can learn. It is one of the indicators of the expressive power of machine learning models and is important for comparing the performance of various models. In this study, we analyze the structure of the solution space and the storage capacity of fully connected two-layer neural networks with general activation functions using the replica method from statistical physics. Our results demonstrate that the storage capacity per parameter remains finite even with infinite width and that the weights of the network exhibit negative correlations, leading to a 'division of labor'. In addition, we find that increasing the dataset size triggers a phase transition at a certain transition point where the permutation symmetry of weights is broken, resulting in the solution space splitting into disjoint regions. We identify the dependence of this transition point and the storage capacity on the choice of activation function. These findings contribute to understanding the influence of activation functions and the number of parameters on the structure of the solution space, potentially offering insights for selecting appropriate architectures based on specific objectives.

activation function, neural network, storage capacity, (16 more...)

2404.13404

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
Asia > Japan > Honshū > Tōhoku > Miyagi Prefecture > Sendai (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Kornilov, Nikita, Gasnikov, Alexander, Korotin, Alexander

Optimal Flow Matching: Learning Straight Trajectories in Just One Step

arXiv.org Machine Learning, Mar-19-2024

Over the past several years, there has been a boom in the development of flow matching methods for generative modeling. One intriguing property pursued by the community is the ability to learn flows with straight trajectories which realize the optimal transport (OT) displacements. Straightness is crucial for fast integration of the learned flow's paths. Unfortunately, most existing flow straightening methods are based on non-trivial iterative procedures which accumulate error during training or exploit heuristic minibatch OT approximations. To address this issue, we develop a novel optimal flow matching approach which recovers the straight OT displacement for the quadratic cost in just one flow matching step.
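To illustrate why straightness matters for integration, here is a minimal one-dimensional sketch. It assumes both distributions are Gaussian, in which case the quadratic-cost OT map is known in closed form and is affine. This is not the paper's training method: the velocity field is written down analytically rather than learned, and all names are hypothetical. The point is only that when every trajectory is a straight line, a single Euler step already lands on the OT displacement.

```python
def make_straight_velocity(m0, s0, m1, s1):
    """Velocity field whose paths are straight lines from x0 to T(x0).

    T(x) = a * x + b is the quadratic-cost OT map between the 1D Gaussians
    N(m0, s0^2) and N(m1, s1^2).
    """
    a = s1 / s0
    b = m1 - a * m0

    def velocity(x, t):
        # On the straight path, x_t = ((1 - t) + t * a) * x0 + t * b.
        scale = (1.0 - t) + t * a
        x0 = (x - t * b) / scale       # recover the starting point
        return (a - 1.0) * x0 + b      # T(x0) - x0, constant along the path

    return velocity


def euler(x, velocity, steps):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with Euler steps."""
    dt = 1.0 / steps
    for k in range(steps):
        x = x + dt * velocity(x, k * dt)
    return x


velocity = make_straight_velocity(m0=0.0, s0=1.0, m1=3.0, s1=2.0)
one_step = euler(0.5, velocity, steps=1)     # a single Euler step
ten_steps = euler(0.5, velocity, steps=10)   # finer integration, same endpoint
```

With these hypothetical parameters the OT map sends 0.5 to 3 + 2 * 0.5 = 4, and both the one-step and ten-step integrations agree with it up to floating-point error. A curved trajectory would instead require many steps to reach comparable accuracy.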

flow matching, matching, optimal flow matching, (11 more...)

2403.13117

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Cardoso, Gabriel, Idrissi, Yazid Janati El, Corff, Sylvain Le, Moulines, Eric

Monte Carlo guided Diffusion for Bayesian linear inverse problems

arXiv.org Machine Learning, Oct-25-2023

Ill-posed linear inverse problems arise frequently in various applications, from computational photography to medical imaging. A recent line of research exploits Bayesian inference with informative priors to handle the ill-posedness of such problems. Amongst such priors, score-based generative models (SGM) have recently been successfully applied to several different inverse problems. In this study, we exploit the particular structure of the prior defined by the SGM to define a sequence of intermediate linear inverse problems. As the noise level decreases, the posteriors of these inverse problems get closer to the target posterior of the original inverse problem. To sample from this sequence of posteriors, we propose the use of Sequential Monte Carlo (SMC) methods. The proposed algorithm, MCGDiff, is shown to be theoretically grounded and we provide numerical simulations showing that it outperforms competing baselines when dealing with ill-posed inverse problems in a Bayesian setting.
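For intuition about how a prior handles ill-posedness, the sketch below swaps the SGM prior for a standard Gaussian prior, an assumption made only so that the posterior has a closed form. It is not MCGDiff and involves no sampling. A single noisy measurement y = a·x + ε of a two-dimensional unknown x cannot identify x on its own, yet the posterior is still a proper Gaussian. The function name and numbers are hypothetical.

```python
def gaussian_posterior(a, y, noise_var):
    """Posterior of x ~ N(0, I) given y = a.x + eps, with eps ~ N(0, noise_var).

    Returns the posterior mean (list) and covariance (nested list).
    """
    denom = sum(ai * ai for ai in a) + noise_var
    mean = [ai * y / denom for ai in a]
    n = len(a)
    cov = [
        [(1.0 if i == j else 0.0) - a[i] * a[j] / denom for j in range(n)]
        for i in range(n)
    ]
    return mean, cov


# One measurement of the sum of two unknowns: an underdetermined problem.
mean, cov = gaussian_posterior(a=[1.0, 1.0], y=2.0, noise_var=0.5)

# Posterior variance along the unobserved direction (1, -1) / sqrt(2).
unobserved_var = (cov[0][0] - cov[0][1] - cov[1][0] + cov[1][1]) / 2.0
```

In this toy case the measured direction is sharpened by the data, while the direction the measurement cannot see keeps its prior variance. That is why the informativeness of the prior matters in ill-posed problems, and why a richer prior such as an SGM can help.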

algorithm, inverse problem, posterior, (16 more...)

2308.07983

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.48)
Health & Medicine > Health Care Technology (0.34)
Media > Photography (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Baldassi, Carlo, Zecchina, Riccardo

Efficiency of quantum versus classical annealing in non-convex learning problems

arXiv.org Machine Learning, Oct-16-2017

Quantum annealers aim at solving non-convex optimization problems by exploiting cooperative tunneling effects to escape local minima. The underlying idea consists in designing a classical energy function whose ground states are the sought optimal solutions of the original optimization problem and adding a controllable quantum transverse field to generate tunneling processes. A key challenge is to identify classes of non-convex optimization problems for which quantum annealing remains efficient while thermal annealing fails. We show that this happens for a wide class of problems which are central to machine learning. Their energy landscapes are dominated by local minima that cause exponential slowdown of classical thermal annealers, while simulated quantum annealing converges efficiently to rare dense regions of optimal solutions.

artificial intelligence, configuration, machine learning, (19 more...)

1706.0847

Country: Europe > Italy (0.46)

Genre: Research Report > New Finding (0.92)

Industry: Education > Focused Education > Special Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Baldassi, Carlo, Gerace, Federica, Lucibello, Carlo, Saglietti, Luca, Zecchina, Riccardo

Learning may need only a few bits of synaptic precision

arXiv.org Machine Learning, May-27-2016

Learning in neural networks poses peculiar challenges when using discretized rather than continuous synaptic states. The choice of discrete synapses is motivated by biological reasoning and experiments, and possibly by hardware implementation considerations as well. In this paper we extend a previous large deviations analysis which unveiled the existence of peculiar dense regions in the space of synaptic states which account for the possibility of learning efficiently in networks with binary synapses. We extend the analysis to synapses with multiple states and generally more plausible biological features. The results clearly indicate that the overall qualitative picture is unchanged with respect to the binary case, and very robust to variation of the details of the model. We also provide quantitative results which suggest that the advantages of increasing the synaptic precision (i.e., the number of internal synaptic states) rapidly vanish after the first few bits, and therefore that, for practical applications, only a few bits may be needed for near-optimal performance, consistent with recent biological findings. Finally, we demonstrate how the theoretical analysis can be exploited to design efficient algorithmic search strategies.

artificial intelligence, machine learning, natural language, (18 more...)

doi: 10.1103/PhysRevE.93.052313

1602.04129

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.87)